Overview

Dataset statistics

Number of variables: 10
Number of observations: 53930
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 146
Duplicate rows (%): 0.3%
Total size in memory: 3.9 MiB
Average record size in memory: 76.0 B
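
The memory and duplicate figures can be checked with simple arithmetic. The sketch below assumes (the report does not state this) that nine columns are stored as 8-byte values and that price is stored as a 4-byte value, which would match the per-variable memory sizes listed under Variables. Index overhead is ignored and all names are illustrative.

```python
N_ROWS = 53930        # "Number of observations"
N_DUPLICATES = 146    # "Duplicate rows"

# Assumption: nine 8-byte columns (for example float64 or int64)
# and one 4-byte column (price). The report does not list dtypes.
BYTES_WIDE, BYTES_NARROW = 8, 4
N_WIDE, N_NARROW = 9, 1

wide_column_bytes = N_ROWS * BYTES_WIDE
narrow_column_bytes = N_ROWS * BYTES_NARROW
total_bytes = N_WIDE * wide_column_bytes + N_NARROW * narrow_column_bytes

wide_kib = round(wide_column_bytes / 1024, 1)        # per-variable "Memory size" in KiB
narrow_kib = round(narrow_column_bytes / 1024, 1)    # "Memory size" of price in KiB
total_mib = round(total_bytes / 1024 ** 2, 1)        # "Total size in memory" in MiB
record_bytes = total_bytes / N_ROWS                  # "Average record size in memory" in B
duplicate_pct = round(100 * N_DUPLICATES / N_ROWS, 1)

print(wide_kib, narrow_kib, total_mib, record_bytes, duplicate_pct)
```

Under these assumptions the script reproduces 421.3 KiB, 210.7 KiB, 3.9 MiB, 76.0 B and 0.3%.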

Variable types

NUM: 10

Warnings

Dataset has 146 (0.3%) duplicate rows (Duplicates)
price is highly correlated with carat (High correlation)
carat is highly correlated with price and 3 other fields (High correlation)
x is highly correlated with carat and 2 other fields (High correlation)
y is highly correlated with carat and 2 other fields (High correlation)
z is highly correlated with carat and 2 other fields (High correlation)

Reproduction

Analysis started: 2020-09-06 22:33:21.422037
Analysis finished: 2020-09-06 22:33:48.434916
Duration: 27.01 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
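
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration that produced this report: the file names and the title are hypothetical, and it assumes pandas and pandas-profiling v2.x are installed.

```python
def build_report(csv_path, html_path):
    """Profile a CSV file and write the HTML report (illustrative sketch)."""
    # Imported lazily so the module still loads where the optional
    # third-party packages are not installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Diamonds profiling report")  # hypothetical title
    profile.to_file(html_path)
    return profile


# Example call with hypothetical file names:
# build_report("diamonds_encoded.csv", "diamonds_report.html")
```

The settings actually used for this run are the ones in the downloadable config.yaml, which the sketch does not reproduce.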

Variables

carat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 273
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.7979757093
Minimum: 0.2
Maximum: 5.01
Zeros: 0
Zeros (%): 0.0%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 0.2
5-th percentile: 0.3
Q1: 0.4
Median: 0.7
Q3: 1.04
95-th percentile: 1.7
Maximum: 5.01
Range: 4.81
Interquartile range (IQR): 0.64

Descriptive statistics

Standard deviation: 0.4740348567
Coefficient of variation (CV): 0.594046725
Kurtosis: 1.25617613
Mean: 0.7979757093
Median Absolute Deviation (MAD): 0.32
Skewness: 1.116517792
Sum: 43034.83
Variance: 0.2247090454
Monotonicity: Not monotonic
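
Several of the carat statistics are simple functions of the others. The sketch below recomputes them from the reported values, taking the number of observations from the overview. The variable names are illustrative, and tiny differences are expected because the report prints rounded numbers.

```python
import math

# Values copied from the carat statistics in this report.
N_OBSERVATIONS = 53930
minimum, maximum = 0.2, 5.01
q1, q3 = 0.4, 1.04
mean = 0.7979757093
std_dev = 0.4740348567
total = 43034.83

value_range = maximum - minimum          # reported Range: 4.81
iqr = q3 - q1                            # reported IQR: 0.64
coeff_of_variation = std_dev / mean      # reported CV: 0.594046725
variance = std_dev ** 2                  # reported Variance: 0.2247090454
mean_from_sum = total / N_OBSERVATIONS   # reported Mean: 0.7979757093

# Compare with the reported figures, allowing for rounding in the report.
assert math.isclose(value_range, 4.81, abs_tol=1e-9)
assert math.isclose(iqr, 0.64, abs_tol=1e-9)
assert math.isclose(coeff_of_variation, 0.594046725, abs_tol=1e-6)
assert math.isclose(variance, 0.2247090454, abs_tol=1e-6)
assert math.isclose(mean_from_sum, mean, abs_tol=1e-6)
```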
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
0.3                   2604    4.8%
0.31                  2249    4.2%
1.01                  2242    4.2%
0.7                   1981    3.7%
0.32                  1839    3.4%
1                     1558    2.9%
0.9                   1484    2.8%
0.41                  1382    2.6%
0.4                   1298    2.4%
0.71                  1293    2.4%
Other values (263)    36000   66.8%

Minimum 5 values
Value                 Count   Frequency (%)
0.2                   12      < 0.1%
0.21                  9       < 0.1%
0.22                  5       < 0.1%
0.23                  293     0.5%
0.24                  254     0.5%

Maximum 5 values
Value                 Count   Frequency (%)
5.01                  1       < 0.1%
4.5                   1       < 0.1%
4.13                  1       < 0.1%
4.01                  2       < 0.1%
4                     1       < 0.1%

cut
Real number (ℝ≥0)

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.90406082
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 4
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.116593133
Coefficient of variation (CV): 0.2860081295
Kurtosis: -0.3979206693
Mean: 3.90406082
Median Absolute Deviation (MAD): 1
Skewness: -0.7171326869
Sum: 210546
Variance: 1.246780224
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=5)
Common values
Value                 Count   Frequency (%)
5                     21546   40.0%
4                     13788   25.6%
3                     12082   22.4%
2                     4904    9.1%
1                     1610    3.0%

Minimum 5 values
Value                 Count   Frequency (%)
1                     1610    3.0%
2                     4904    9.1%
3                     12082   22.4%
4                     13788   25.6%
5                     21546   40.0%

Maximum 5 values
Value                 Count   Frequency (%)
5                     21546   40.0%
4                     13788   25.6%
3                     12082   22.4%
2                     4904    9.1%
1                     1610    3.0%

color
Real number (ℝ≥0)

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.594103467
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 5
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.701080135
Coefficient of variation (CV): 0.4732974858
Kurtosis: -0.8667626966
Mean: 3.594103467
Median Absolute Deviation (MAD): 1
Skewness: 0.1894017457
Sum: 193830
Variance: 2.893673625
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values
Value                 Count   Frequency (%)
4                     11288   20.9%
2                     9796    18.2%
3                     9541    17.7%
5                     8304    15.4%
1                     6774    12.6%
6                     5420    10.1%
7                     2807    5.2%

Minimum 5 values
Value                 Count   Frequency (%)
1                     6774    12.6%
2                     9796    18.2%
3                     9541    17.7%
4                     11288   20.9%
5                     8304    15.4%

Maximum 5 values
Value                 Count   Frequency (%)
7                     2807    5.2%
6                     5420    10.1%
5                     8304    15.4%
4                     11288   20.9%
3                     9541    17.7%

clarity
Real number (ℝ≥0)

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.949119229
Minimum: 2
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 2
5-th percentile: 3
Q1: 5
Median: 6
Q3: 7
95-th percentile: 8
Maximum: 9
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.647180648
Coefficient of variation (CV): 0.2768780696
Kurtosis: -0.3945409522
Mean: 5.949119229
Median Absolute Deviation (MAD): 1
Skewness: -0.5516774328
Sum: 320836
Variance: 2.713204086
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values
Value                 Count   Frequency (%)
7                     13065   24.2%
6                     12256   22.7%
8                     9193    17.0%
5                     8167    15.1%
4                     5063    9.4%
3                     3655    6.8%
2                     1790    3.3%
9                     741     1.4%

Minimum 5 values
Value                 Count   Frequency (%)
2                     1790    3.3%
3                     3655    6.8%
4                     5063    9.4%
5                     8167    15.1%
6                     12256   22.7%

Maximum 5 values
Value                 Count   Frequency (%)
9                     741     1.4%
8                     9193    17.0%
7                     13065   24.2%
6                     12256   22.7%
5                     8167    15.1%

depth
Real number (ℝ≥0)

Distinct: 184
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.74932505
Minimum: 43
Maximum: 79
Zeros: 0
Zeros (%): 0.0%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 43
5-th percentile: 59.3
Q1: 61
Median: 61.8
Q3: 62.5
95-th percentile: 63.8
Maximum: 79
Range: 36
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 1.432710583
Coefficient of variation (CV): 0.02320204443
Kurtosis: 5.738791716
Mean: 61.74932505
Median Absolute Deviation (MAD): 0.7
Skewness: -0.08216973706
Sum: 3330141.1
Variance: 2.052659616
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
62                    2239    4.2%
61.9                  2163    4.0%
61.8                  2077    3.9%
62.2                  2039    3.8%
62.1                  2017    3.7%
61.6                  1955    3.6%
62.3                  1940    3.6%
61.7                  1904    3.5%
62.4                  1791    3.3%
61.5                  1718    3.2%
Other values (174)    34087   63.2%

Minimum 5 values
Value                 Count   Frequency (%)
43                    2       < 0.1%
44                    1       < 0.1%
50.8                  1       < 0.1%
51                    1       < 0.1%
52.2                  1       < 0.1%

Maximum 5 values
Value                 Count   Frequency (%)
79                    2       < 0.1%
78.2                  1       < 0.1%
73.6                  1       < 0.1%
72.9                  1       < 0.1%
72.2                  1       < 0.1%

table
Real number (ℝ≥0)

Distinct: 127
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 57.45732802
Minimum: 43
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 43
5-th percentile: 54
Q1: 56
Median: 57
Q3: 59
95-th percentile: 61
Maximum: 95
Range: 52
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.234577995
Coefficient of variation (CV): 0.03889108791
Kurtosis: 2.80171247
Mean: 57.45732802
Median Absolute Deviation (MAD): 1
Skewness: 0.7968352098
Sum: 3098673.7
Variance: 4.993338815
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
56                    9879    18.3%
57                    9722    18.0%
58                    8368    15.5%
59                    6570    12.2%
55                    6266    11.6%
60                    4241    7.9%
54                    2594    4.8%
61                    2282    4.2%
62                    1273    2.4%
63                    588     1.1%
Other values (117)    2147    4.0%

Minimum 5 values
Value                 Count   Frequency (%)
43                    1       < 0.1%
44                    1       < 0.1%
49                    2       < 0.1%
50                    2       < 0.1%
50.1                  1       < 0.1%

Maximum 5 values
Value                 Count   Frequency (%)
95                    1       < 0.1%
79                    1       < 0.1%
76                    1       < 0.1%
73                    4       < 0.1%
71                    1       < 0.1%

price
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 11602
Distinct (%): 21.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3933.054942
Minimum: 326
Maximum: 18823
Zeros: 0
Zeros (%): 0.0%
Memory size: 210.7 KiB

Quantile statistics

Minimum: 326
5-th percentile: 544
Q1: 950
Median: 2401
Q3: 5325
95-th percentile: 13108.1
Maximum: 18823
Range: 18497
Interquartile range (IQR): 4375

Descriptive statistics

Standard deviation: 3989.628569
Coefficient of variation (CV): 1.014384144
Kurtosis: 2.177188035
Mean: 3933.054942
Median Absolute Deviation (MAD): 1670
Skewness: 1.618288043
Sum: 212109653
Variance: 15917136.12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
605                   132     0.2%
802                   127     0.2%
625                   126     0.2%
828                   124     0.2%
776                   124     0.2%
698                   121     0.2%
789                   121     0.2%
544                   120     0.2%
666                   114     0.2%
552                   113     0.2%
Other values (11592)  52708   97.7%

Minimum 5 values
Value                 Count   Frequency (%)
326                   2       < 0.1%
327                   1       < 0.1%
334                   1       < 0.1%
335                   1       < 0.1%
336                   2       < 0.1%

Maximum 5 values
Value                 Count   Frequency (%)
18823                 1       < 0.1%
18818                 1       < 0.1%
18806                 1       < 0.1%
18804                 1       < 0.1%
18803                 1       < 0.1%

x
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 554
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.731236232
Minimum: 0
Maximum: 10.74
Zeros: 8
Zeros (%): < 0.1%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.29
Q1: 4.71
Median: 5.7
Q3: 6.54
95-th percentile: 7.66
Maximum: 10.74
Range: 10.74
Interquartile range (IQR): 1.83

Descriptive statistics

Standard deviation: 1.121807157
Coefficient of variation (CV): 0.195735634
Kurtosis: -0.6183150452
Mean: 5.731236232
Median Absolute Deviation (MAD): 0.93
Skewness: 0.3785630303
Sum: 309085.57
Variance: 1.258451299
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
4.37                  448     0.8%
4.34                  437     0.8%
4.33                  429     0.8%
4.38                  428     0.8%
4.32                  425     0.8%
4.35                  407     0.8%
4.39                  388     0.7%
4.31                  387     0.7%
4.36                  386     0.7%
4.4                   373     0.7%
Other values (544)    49822   92.4%

Minimum 5 values
Value                 Count   Frequency (%)
0                     8       < 0.1%
3.73                  2       < 0.1%
3.74                  1       < 0.1%
3.76                  1       < 0.1%
3.77                  1       < 0.1%

Maximum 5 values
Value                 Count   Frequency (%)
10.74                 1       < 0.1%
10.23                 1       < 0.1%
10.14                 1       < 0.1%
10.02                 1       < 0.1%
10.01                 1       < 0.1%

y
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 552
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.734600593
Minimum: 0
Maximum: 58.9
Zeros: 7
Zeros (%): < 0.1%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4.3
Q1: 4.72
Median: 5.71
Q3: 6.54
95-th percentile: 7.65
Maximum: 58.9
Range: 58.9
Interquartile range (IQR): 1.82

Descriptive statistics

Standard deviation: 1.142184369
Coefficient of variation (CV): 0.1991741797
Kurtosis: 91.21490696
Mean: 5.734600593
Median Absolute Deviation (MAD): 0.92
Skewness: 2.434172149
Sum: 309267.01
Variance: 1.304585133
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
4.34                  437     0.8%
4.37                  435     0.8%
4.35                  425     0.8%
4.33                  421     0.8%
4.32                  414     0.8%
4.39                  407     0.8%
4.38                  406     0.8%
4.31                  386     0.7%
4.4                   386     0.7%
4.41                  384     0.7%
Other values (542)    49829   92.4%

Minimum 5 values
Value                 Count   Frequency (%)
0                     7       < 0.1%
3.68                  1       < 0.1%
3.71                  2       < 0.1%
3.72                  1       < 0.1%
3.73                  1       < 0.1%

Maximum 5 values
Value                 Count   Frequency (%)
58.9                  1       < 0.1%
31.8                  1       < 0.1%
10.54                 1       < 0.1%
10.16                 1       < 0.1%
10.1                  1       < 0.1%

z
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 375
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.538776377
Minimum: 0
Maximum: 31.8
Zeros: 20
Zeros (%): < 0.1%
Memory size: 421.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2.65
Q1: 2.91
Median: 3.53
Q3: 4.04
95-th percentile: 4.73
Maximum: 31.8
Range: 31.8
Interquartile range (IQR): 1.13

Descriptive statistics

Standard deviation: 0.705729344
Coefficient of variation (CV): 0.1994275051
Kurtosis: 47.08679971
Mean: 3.538776377
Median Absolute Deviation (MAD): 0.57
Skewness: 1.522387342
Sum: 190846.21
Variance: 0.498053907
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count   Frequency (%)
2.7                   767     1.4%
2.69                  748     1.4%
2.71                  738     1.4%
2.68                  730     1.4%
2.72                  697     1.3%
2.67                  649     1.2%
2.73                  612     1.1%
2.66                  555     1.0%
2.74                  547     1.0%
4.02                  538     1.0%
Other values (365)    47349   87.8%

Minimum 5 values
Value                 Count   Frequency (%)
0                     20      < 0.1%
1.07                  1       < 0.1%
1.41                  1       < 0.1%
1.53                  1       < 0.1%
2.06                  1       < 0.1%

Maximum 5 values
Value                 Count   Frequency (%)
31.8                  1       < 0.1%
8.06                  1       < 0.1%
6.98                  1       < 0.1%
6.72                  1       < 0.1%
6.43                  1       < 0.1%

Interactions

[Figures: 100 interaction plots, one for each pairing of the 10 variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a perfectly linear relationship the steepness of the line does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
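
The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report itself runs; the function name is made up, and the input uses the carat and price values from the "First rows" sample.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so plain sums of products are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / sqrt(ss_x * ss_y)


# carat and price from the "First rows" sample of this report
carat = [0.23, 0.21, 0.23, 0.29, 0.31, 0.24, 0.24, 0.26, 0.22, 0.23]
price = [326, 326, 327, 334, 335, 336, 336, 337, 337, 338]

r_sample = pearson_r(carat, price)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(carat, [2.5 * p + 100 for p in price])
print(r_sample, r_rescaled)
```

Ten rows are far too few to reproduce the correlation the report computes on all 53930 observations; the sample only illustrates the mechanics.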

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
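
As a rough sketch of that recipe, the code below ranks both variables and applies the covariance formula to the ranks. Giving tied values the average of their positions is an assumption made here (a common convention); the function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / sqrt(ss_x * ss_y)


# A nonlinear but strictly increasing relationship has a perfect rank correlation.
xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]
rho_monotonic = spearman_rho(xs, cubes)
print(rho_monotonic)
```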

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
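
The pair-counting definition above can be sketched directly. This version counts tied pairs as neither concordant nor discordant and applies no tie correction, so it may differ from library implementations on data with many ties; the function name is illustrative.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
        # direction == 0 is a tie and is counted in neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three pairs: two concordant, one discordant.
tau_example = kendall_tau([1, 2, 3], [1, 3, 2])
print(tau_example)
```

The quadratic loop over all pairs is fine for illustration but slow for a dataset of this size.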

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figures: two missing-value plots]

Sample

First rows

index  carat  cut  color  clarity  depth  table  price  x      y      z
0      0.23   5    2      8        61.5   55.0   326    3.95   3.98   2.43
1      0.21   4    2      7        59.8   61.0   326    3.89   3.84   2.31
2      0.23   2    2      5        56.9   65.0   327    4.05   4.07   2.31
3      0.29   4    6      6        62.4   58.0   334    4.20   4.23   2.63
4      0.31   2    7      8        63.3   58.0   335    4.34   4.35   2.75
5      0.24   3    7      4        62.8   57.0   336    3.94   3.96   2.48
6      0.24   3    6      3        62.3   57.0   336    3.95   3.98   2.47
7      0.26   3    5      7        61.9   55.0   337    4.07   4.11   2.53
8      0.22   1    2      6        65.1   61.0   337    3.87   3.78   2.49
9      0.23   3    5      5        59.4   61.0   338    4.00   4.05   2.39

Last rows

index  carat  cut  color  clarity  depth  table  price  x      y      z
53920  0.71   4    2      7        60.5   55.0   2756   5.79   5.74   3.49
53921  0.71   4    3      7        59.8   62.0   2756   5.74   5.73   3.43
53922  0.70   3    2      6        60.5   59.0   2757   5.71   5.76   3.47
53923  0.70   3    2      6        61.2   59.0   2757   5.69   5.72   3.49
53924  0.72   4    1      7        62.7   59.0   2757   5.69   5.73   3.58
53925  0.72   5    1      7        60.8   57.0   2757   5.75   5.76   3.50
53926  0.72   2    1      7        63.1   55.0   2757   5.69   5.75   3.61
53927  0.70   3    1      7        62.8   60.0   2757   5.66   5.68   3.56
53928  0.86   4    5      8        61.0   58.0   2757   6.15   6.12   3.74
53929  0.75   5    1      8        62.2   55.0   2757   5.83   5.87   3.64

Duplicate rows

Most frequent

index  carat  cut  color  clarity  depth  table  price  x      y      z      count
84     0.79   5    4      7        62.3   57.0   2898   5.90   5.85   3.66   5
0      0.30   2    7      5        63.4   57.0   394    4.23   4.26   2.69   2
1      0.30   3    4      6        63.0   55.0   526    4.29   4.31   2.71   2
2      0.30   3    7      5        63.4   57.0   506    4.26   4.23   2.69   2
3      0.30   4    1      7        62.2   58.0   709    4.31   4.28   2.67   2
4      0.30   5    4      2        62.1   55.0   863    4.32   4.35   2.69   2
5      0.30   5    4      6        63.0   55.0   675    4.31   4.29   2.71   2
6      0.30   5    5      7        62.2   57.0   450    4.26   4.29   2.66   2
7      0.30   5    5      7        62.2   57.0   450    4.27   4.28   2.66   2
8      0.31   2    1      7        63.5   56.0   571    4.29   4.31   2.73   2